AITopics | Arabian Sea

Collaborating Authors

Arabian Sea

Large Causal Models from Large Language Models

arXiv.org Artificial IntelligenceDec-9-2025

We introduce a new paradigm for building large causal models (LCMs) that exploits the enormous potential latent in today's large language models (LLMs). We describe our ongoing experiments with an implemented system called DEMOCRITUS (Decentralized Extraction of Manifold Ontologies of Causal Relations Integrating Topos Universal Slices) aimed at building, organizing, and visualizing LCMs that span disparate domains extracted from carefully targeted textual queries to LLMs. DEMOCRITUS is methodologically distinct from traditional narrow domain and hypothesis centered causal inference that builds causal models from experiments that produce numerical data. A high-quality LLM is used to propose topics, generate causal questions, and extract plausible causal statements from a diverse range of domains. The technical challenge is then to take these isolated, fragmented, potentially ambiguous and possibly conflicting causal claims, and weave them into a coherent whole, converting them into relational causal triples and embedding them into a LCM. Addressing this technical challenge required inventing new categorical machine learning methods, which we can only briefly summarize in this paper, as it is focused more on the systems side of building DEMOCRITUS. We describe the implementation pipeline for DEMOCRITUS comprising of six modules, examine its computational cost profile to determine where the current bottlenecks in scaling the system to larger models. We describe the results of using DEMOCRITUS over a wide range of domains, spanning archaeology, biology, climate change, economics, medicine and technology. We discuss the limitations of the current DEMOCRITUS system, and outline directions for extending its capabilities.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.07796

Country:

Africa > Middle East > Egypt (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Government (1.00)
Banking & Finance > Economy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Learning-Driven Downscaling for Climate Risk Assessment of Projected Temperature Extremes in the Nordic Region

Loganathan, Parthiban, Zea, Elias, Vinuesa, Ricardo, Otero, Evelyn

arXiv.org Artificial IntelligenceNov-7-2025

Rapid changes and increasing climatic variability across the widely varied Koppen-Geiger regions of northern Europe generate significant needs for adaptation. Regional planning needs high-resolution projected temperatures. This work presents an integrative downscaling framework that incorporates Vision Transformer (ViT), Convolutional Long Short-Term Memory (ConvLSTM), and Geospatial Spatiotemporal Transformer with Attention and Imbalance-Aware Network (GeoStaNet) models. The framework is evaluated with a multicriteria decision system, Deep Learning-TOPSIS (DL-TOPSIS), for ten strategically chosen meteorological stations encompassing the temperate oceanic (Cfb), subpolar oceanic (Cfc), warm-summer continental (Dfb), and subarctic (Dfc) climate regions. Norwegian Earth System Model (NorESM2-LM) Coupled Model Intercomparison Project Phase 6 (CMIP6) outputs were bias-corrected during the 1951-2014 period and subsequently validated against earlier observations of day-to-day temperature metrics and diurnal range statistics. The ViT showed improved performance (Root Mean Squared Error (RMSE): 1.01 degrees C; R^2: 0.92), allowing for production of credible downscaled projections. Under the SSP5-8.5 scenario, the Dfc and Dfb climate zones are projected to warm by 4.8 degrees C and 3.9 degrees C, respectively, by 2100, with expansion in the diurnal temperature range by more than 1.5 degrees C. The Time of Emergence signal first appears in subarctic winter seasons (Dfc: approximately 2032), signifying an urgent need for adaptation measures. The presented framework offers station-based, high-resolution estimates of uncertainties and extremes, with direct uses for adaptation policy over high-latitude regions with fast environmental change.

artificial intelligence, machine learning, projection, (18 more...)

arXiv.org Artificial Intelligence

2511.0377

Country:

Europe > Northern Europe (0.24)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
(18 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.50)
Transportation (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards LLM Agents for Earth Observation

Kao, Chia Hsiang, Zhao, Wenting, Revankar, Shreelekha, Speas, Samuel, Bhagat, Snehal, Datta, Rajeev, Phoo, Cheng Perng, Mall, Utkarsh, Vondrick, Carl, Bala, Kavita, Hariharan, Bharath

arXiv.org Artificial IntelligenceSep-16-2025

Earth Observation (EO) provides critical planetary data for environmental monitoring, disaster management, climate science, and other scientific domains. Here we ask: Are AI systems ready for reliable Earth Observation? We introduce \datasetnamenospace, a benchmark of 140 yes/no questions from NASA Earth Observatory articles across 13 topics and 17 satellite sensors. Using Google Earth Engine API as a tool, LLM agents can only achieve an accuracy of 33% because the code fails to run over 58% of the time. We improve the failure rate for open models by fine-tuning synthetic data, allowing much smaller models (Llama-3.1-8B) to achieve comparable accuracy to much larger ones (e.g., DeepSeek-R1). Taken together, our findings identify significant challenges to be solved before AI agents can automate earth observation, and suggest paths forward. The project page is available at https://iandrover.github.io/UnivEarth.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.1211

Country:

South America > Argentina > Argentine Northwest > Salta Province (0.04)
Asia > China (0.04)
Africa > Middle East > Somalia (0.04)
(7 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Energy (0.47)
Government > Space Agency (0.36)
Government > Regional Government > North America Government > United States Government (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Learning county from pixels: Corn yield prediction with attention-weighted multiple instance learning

Wang, Xiaoyu, Ma, Yuchi, Huang, Qunying, Yang, Zhengwei, Zhang, Zhou

arXiv.org Artificial IntelligenceDec-1-2023

Remote sensing technology has become a promising tool in yield prediction. Most prior work employs satellite imagery for county-level corn yield prediction by spatially aggregating all pixels within a county into a single value, potentially overlooking the detailed information and valuable insights offered by more granular data. To this end, this research examines each county at the pixel level and applies multiple instance learning to leverage detailed information within a county. In addition, our method addresses the "mixed pixel" issue caused by the inconsistent resolution between feature datasets and crop mask, which may introduce noise into the model and therefore hinder accurate yield prediction. Specifically, the attention mechanism is employed to automatically assign weights to different pixels, which can mitigate the influence of mixed pixels. The experimental results show that the developed model outperforms four other machine learning models over the past five years in the U.S. corn belt and demonstrates its best performance in 2022, achieving a coefficient of determination (R2) value of 0.84 and a root mean square error (RMSE) of 0.83. This paper demonstrates the advantages of our approach from both spatial and temporal perspectives. Furthermore, through an in-depth study of the relationship between mixed pixels and attention, it is verified that our approach can capture critical feature information while filtering out noise from mixed pixels.

artificial intelligence, machine learning, pixel, (17 more...)

arXiv.org Artificial Intelligence

2312.01001

Country:

North America > United States > North Dakota (0.05)
North America > United States > Kansas (0.05)
North America > United States > Minnesota (0.04)
(15 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Implementing a new fully stepwise decomposition-based sampling technique for the hybrid water level forecasting model in real-world application

Zhang, Ziqian, Bao, Nana, Yan, Xingting, Zhu, Aokai, Li, Chenyang, Liu, Mingyu

arXiv.org Artificial IntelligenceSep-19-2023

Various time variant non-stationary signals need to be pre-processed properly in hydrological time series forecasting in real world, for example, predictions of water level. Decomposition method is a good candidate and widely used in such a pre-processing problem. However, decomposition methods with an inappropriate sampling technique may introduce future data which is not available in practical applications, and result in incorrect decomposition-based forecasting models. In this work, a novel Fully Stepwise Decomposition-Based (FSDB) sampling technique is well designed for the decomposition-based forecasting model, strictly avoiding introducing future information. This sampling technique with decomposition methods, such as Variational Mode Decomposition (VMD) and Singular spectrum analysis (SSA), is applied to predict water level time series in three different stations of Guoyang and Chaohu basins in China. Results of VMD-based hybrid model using FSDB sampling technique show that Nash-Sutcliffe Efficiency (NSE) coefficient is increased by 6.4%, 28.8% and 7.0% in three stations respectively, compared with those obtained from the currently most advanced sampling technique. In the meantime, for series of SSA-based experiments, NSE is increased by 3.2%, 3.1% and 1.1% respectively. We conclude that the newly developed FSDB sampling technique can be used to enhance the performance of decomposition-based hybrid model in water level time series forecasting in real world.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2309.10658

Country:

Asia > China > Anhui Province > Hefei (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Kernel Learning for Explainable Climate Science

Lalchand, Vidhi, Tazi, Kenza, Cheema, Talay M., Turner, Richard E., Hosking, Scott

arXiv.org Artificial IntelligenceJul-16-2023

The Upper Indus Basin, Himalayas provides water for 270 million people and countless ecosystems. However, precipitation, a key component to hydrological modelling, is poorly understood in this area. A key challenge surrounding this uncertainty comes from the complex spatial-temporal distribution of precipitation across the basin. In this work we propose Gaussian processes with structured non-stationary kernels to model precipitation patterns in the UIB. Previous attempts to quantify or model precipitation in the Hindu Kush Karakoram Himalayan region have often been qualitative or include crude assumptions and simplifications which cannot be resolved at lower resolutions. This body of research also provides little to no error propagation. We account for the spatial variation in precipitation with a non-stationary Gibbs kernel parameterised with an input dependent lengthscale. This allows the posterior function samples to adapt to the varying precipitation patterns inherent in the distinct underlying topography of the Indus region. The input dependent lengthscale is governed by a latent Gaussian process with a stationary squared-exponential kernel to allow the function level hyperparameters to vary smoothly. In ablation experiments we motivate each component of the proposed kernel by demonstrating its ability to model the spatial covariance, temporal structure and joint spatio-temporal reconstruction. We benchmark our model with a stationary Gaussian process and a Deep Gaussian processes.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Artificial Intelligence

2209.04947

Country:

Asia > Pakistan > Arabian Sea (0.25)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
Asia > China (0.05)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Spatial-temporal Multi-Task Learning for Within-field Cotton Yield Prediction

Nguyen, Long, Zhen, Jia, Lin, Zhe, Du, Hanxiang, Yang, Zhou, Guo, Wenxuan, Jin, Fang

arXiv.org Machine LearningNov-15-2018

Understanding and accurately predicting within-field spatial variability of crop yield play a key role in site-specific management of crop inputs such as irrigation water and fertilizer for optimized crop production. However, such a task is challenged by the complex interaction between crop growth and environmental and managerial factors, such as climate, soil conditions, tillage, and irrigation. In this paper, we present a novel Spatial-temporal Multi-Task Learning algorithms for within-field crop yield prediction in west Texas from 2001 to 2003. This algorithm integrates multiple heterogeneous data sources to learn different features simultaneously, and to aggregate spatial-temporal features by introducing a weighted regularizer to the loss functions. Our comprehensive experimental results consistently outperform the results of other conventional methods, and suggest a promising approach, which improves the landscape of crop prediction research fields.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

1811.06665

Country:

Europe > United Kingdom > North Sea > Southern North Sea (0.50)
North America > United States > Texas > Lubbock County > Lubbock (0.04)
Asia > Pakistan > Arabian Sea (0.04)
Asia > Middle East > Republic of Türkiye (0.04)

Genre: Research Report (1.00)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback